Abstract. We study the relationship between obfuscation and white-box cryptography. We capture
the requirements of any white-box primitive using a White-Box Property (WBP) and give some
negative/positive results. Loosely speaking, the WBP is defined for some scheme and a security notion (we
call the pair a specification), and implies that w.r.t. the specification, an obfuscation does not leak any
"useful" information, even though it may leak some "useless" non-black-box information.
Our main result is a negative one - for most interesting programs, an obfuscation (under any definition)
cannot satisfy the WBP for every specification in which the program may be present. To do this, we
define a Universal White-Box Property (UWBP), which if satisfied, would imply that under whatever
specification we conceive, the WBP is satisfied. We then show that for every non-approximately-learnable
family, there exist certain (contrived) specifications for which the WBP (and thus, the UWBP) fails.
On the positive side, we show that there exists an obfuscator for a non-approximately-learnable family
that achieves the WBP for a certain specification. Furthermore, there exists an obfuscator for a
non-learnable (but approximately-learnable) family that achieves the UWBP.

Our results can also be viewed as formalizing the distinction between "useful" and "useless"
non-black-box information.
1 Introduction

Informally, an obfuscator O is a probabilistic compiler that transforms a program P into O(P), an
executable implementation of P which hides certain functional characteristics of P. Starting from
the seminal work of Barak et al. [2], several definitions for obfuscators have been proposed [20,21,24],
each one based on some sort of virtual black-box property (VBBP). Loosely speaking, the VBBP
requires that whatever we could do using the obfuscated program, we could also have done using
black-box access to the original program. The notion of "whatever" can be captured using several
formalisms. The following are the common ones (in decreasing order of generality):

1. Computing something that is indistinguishable from the obfuscation [2,20,21,24].

2. Computing some function p]. 

3. Computing some predicate [2]. 

* This work was partly supported by funds from the European Commission through the IST Program under Contract
IST-021186-2 for the RE-TRUST project and in part by the IAP Program P6/26 BCRYPT of the Belgian State
(Belgian Science Policy).



1.1 White-box Cryptography 



White-box cryptography (WBC), which requires that some given scheme must remain secure even 
if the adversary is given "white-box access" to a functionality instead of just black-box access, is 
an active field of research. Informally, white-box access implies that the adversary is given an 
executable implementation of the algorithm that was used inside the black-box [11,12]. Existing
notions of WBC only deal with the encryption algorithm of symmetric block-ciphers. In this work, 
we generalize this intuition to any cryptographic primitive. For instance, we can use WBC to convert 
a MAC into a signature scheme by white-boxing the verification algorithm. 

White-Box Security. The (black-box) security of any primitive is captured using a security 
notion (e.g., IND-CPA) where the adversary is given black-box access to some functionality (e.g., 
encryption), and a white-box implementation can be required to satisfy that security notion when 
the adversary is given access to a white-boxed version of the functionality. 

1.2 Motivation 

One way to realize WBC is to obfuscate (using an obfuscator) the executable code of the algorithm
and hope that the adversary cannot use it in a non-black-box manner. Ideally, given an obfuscator
satisfying some definition, a white-box implementation could then be proved secure
under some security notion. Furthermore, if a scheme is required to satisfy several security notions
simultaneously (Authenticated Encryption (AE) [4] and the Obfuscated Virtual Machine (OVM)
of [19] are two such examples, where both confidentiality and integrity need to be satisfied), we
would like the obfuscation to ensure that all the security notions are satisfied in the white-box
variant if they are satisfied in the black-box variant. However, it is still not fully clear whether any of the
existing definitions of obfuscators can be used to achieve these goals. Hence, a natural question is:

Given an obfuscator satisfying the virtual black-box property for a program P (in some sense), 
and some scheme that is secure when the adversary is given black-box access to P, can it be 
proved (without additional assumptions) that the scheme remains secure when the adversary is 
also given access to the obfuscated program O(P)?

1.3 Our Contribution 

1. In this paper we answer the above question in the negative - we show that under whatever 
definition of obfuscation we use, the answer to the above question is, in general, no. To do this, 
we first define the objective(s) of a white-box primitive, which we formalize using a white-box 
property (WBP). Our main observation is that when considering obfuscation of most programs 
P, we must also take into account the scheme plus the security notion (i.e., the specification) 
in which P is used. Furthermore, we show that for most programs P, there cannot exist an 
obfuscator that satisfies the WBP for all specifications in which P might be present. To do 
this, we define a universal white-box property (UWBP) which, if satisfied, would imply that 
in whatever specification P might be present, the obfuscated program O(P) will not leak any
"useful" information. We then show that for every non-approximately-learnable program P, 
there exists some specification in which the obfuscation leaks useful information, thereby failing 
the UWBP. 

2. On the positive side, we have the following two results. 

(a) We show that under reasonable computational assumptions, there exists an obfuscator that 
satisfies the WBP w.r.t. some meaningful specification for a non-approximately-learnable 
program P. 



(b) We show that there exist obfuscators that satisfy UWBP for a program that is non-learnable 
but approximately learnable. 

2 Related Work 

Practical white-box implementations of DES and AES encryption algorithms were proposed in [11,12].
However, no definitions of obfuscation were given, nor were there any proofs of security. With
their subsequent cryptanalysis [6,17,25], it remains an open question whether or not such white-box
implementations exist.

The notion of code-obfuscation was first given by Hada in [18], which introduced the concept
of virtual black-box property (VBBP) using computational indistinguishability. In [2], Barak et al.
defined obfuscation using the weaker predicate-based VBBP and showed that there exist
unobfuscatable function families under their definition. Goldwasser and Kalai [13] extend the impossibility
results of [2] w.r.t. auxiliary inputs.

On the positive side, there have been several results too. For instance, Lynn et al. show in [22]
how to obfuscate point functions in the random oracle model. Wee in [24] showed how to
obfuscate point functions without random oracles. Hohenberger et al. [21] used a stronger notion of
obfuscation (average-case secure obfuscation) and showed how it can be used to prove the security
of re-encryption functionality in a weak security model (i.e., IND-CPA). They also presented a
re-encryption scheme under bilinear complexity assumptions. Hofheinz et al. [20] discuss a related
notion of obfuscation and show that IND-CPA encryption and point functions can be securely
obfuscated in their definition. Goldwasser and Rothblum [16] define the notion of "best-possible
obfuscation" in order to give a qualitative measure of information leakage by an obfuscation
(however, they do not differentiate between "useful" and "useless" information). Recently, Canetti and
Dakdouk [10] give an obfuscator for point functions with multi-bit output for use in primitives
called "digital lockers". Finally, Herzberg et al. [19] introduce the concept of White-Box Remote
Program Execution (WBRPE) in order to give a meaningful notion of "software hardening" for all
programs and avoid the negative results of [2].

However, to date, there has not been much work on the relationship between arbitrary
white-box primitives and obfuscation. This paper is intended to fill this gap.

3 Preliminaries 

Denote by $\mathbb{P}$ the set of all positive polynomials and by TM the set of all Turing Machines (TMs).
All TMs considered in this paper are deterministic (a probabilistic TM is simply a deterministic
TM with randomness on the input tape). A mapping $f : \mathbb{N} \ni x \mapsto f(x) \in \mathbb{R}$ is negligible in $x$
(written $f(x) \le negl(x)$) if $\forall p \in \mathbb{P}, \exists x' \in \mathbb{N}, \forall x > x' : f(x) < 1/p(x)$.
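The definition of a negligible function can be explored numerically. The following is a minimal sketch, not part of the original text: the helper name `threshold` and the finite search bound `limit` are assumptions made for illustration. A finite scan can only suggest, never prove, that a function is negligible.

```python
from fractions import Fraction

def threshold(f, p, limit):
    """Return the smallest x0 such that f(x) < 1/p(x) for every x0 < x <= limit.

    Returns None if the inequality already fails at x = limit.
    Only a finite range is inspected, so this is a sanity check, not a proof.
    """
    x0 = None
    for x in range(limit, 0, -1):          # scan downward from the bound
        if f(x) < Fraction(1, p(x)):
            x0 = x - 1                     # inequality holds for all x in (x0, limit]
        else:
            break                          # first failure: stop extending the range
    return x0

# 2^(-x) eventually drops below 1/x^3 (from x = 10 onward within the range scanned).
x0_exp = threshold(lambda x: Fraction(1, 2 ** x), lambda x: x ** 3, 64)

# 1/x^2 never drops below 1/x^3, so it is not negligible.
x0_poly = threshold(lambda x: Fraction(1, x ** 2), lambda x: x ** 3, 64)
```

Here `x0_exp` plays the role of $x'$ in the definition for the single polynomial $p(x) = x^3$; negligibility requires such an $x'$ for every positive polynomial.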

For simplicity, we define the input-space of arbitrary TMs to be {0, 1}*, the set of all strings. 
If, however, the input-space of a TM is well defined and efficiently samplable (for instance, the 
strings should be of a particular encoding), then we implicitly assume that the inputs are chosen
from the input-space sampled using a string from {0, 1}*. All our definitions and results apply in 
this extended setting without any loss of generality. 

Definition 1. In the following, unless otherwise stated, a TM is assumed to have only one input 
tape. 

1. (Equality of TMs.) $X, Y \in$ TM are equal (written $X = Y$) if $\forall a : X(a) = Y(a)$.

2. (Polynomial TM.) $X \in$ TM is a Polynomial TM (PTM) if there exists $p \in \mathbb{P}$ s.t. $\forall a : X(a)$
halts in at most $p(|a|)$ steps. Denote the set of all PTMs by PTM.



3. (PPT Algorithms.) A PPT algorithm (such as an adversary or an obfuscator) is a PTM
with an unknown source of randomness input via an additional random tape. We denote the set
of PPT algorithms by PPT. The running time of a PPT algorithm must be polynomial in the
length of the known inputs.

4. (TM Family.) A TM Family (TMF) is a TM having two input tapes: a key tape and a standard
input tape. We denote by TMF the set of all TMFs. Let $Q \in$ TMF. Then:

(a) The symbol $Q^q$ indicates that the key tape of $Q$ contains string $q$.

(b) We denote by $\mathcal{K}_Q$ the key-space (valid strings for the key tape) of $Q$.

(c) Let $q \in \mathcal{K}_Q$. In our model, the input-space (valid strings for the standard input tape) of $Q^q$ is
fully defined by the parameter $|q|$. We denote this space by $\mathcal{I}_{Q,|q|}$. Furthermore the following
must hold:

$$\exists p \in \mathbb{P}, \forall q \in \mathcal{K}_Q, \forall x \in \mathcal{I}_{Q,|q|} : |x| = p(|q|).$$

5. (Polynomial TM Family.) $Q \in$ TMF is a Polynomial TMF (PTMF) if there exists $p \in \mathbb{P}$
such that $\forall q \in \mathcal{K}_Q, \forall a \in \mathcal{I}_{Q,|q|} : Q^q(a)$ halts in at most $p(|q|)$ steps. We denote the set of all
PTMFs by PTMF.

6. (Learnable Family.) $Q \in$ TMF is learnable if $\exists (L, p) \in$ PPT $\times\ \mathbb{P}$ s.t.

$$\forall k : \Pr[q \leftarrow \{0,1\}^k \cap \mathcal{K}_Q;\ X \leftarrow L^{Q^q}(1^{|q|}, Q) : X = Q^q] \ge 1/p(k)$$

(the probability taken over the coin tosses of $L$) and:

(a) $\forall a$ : if $Q^q(a)$ halts after $t$ steps then $X(a)$ halts after at most $p(t)$ steps.^1

(b) $|X| \le p(|q|)$.

$L$ is called the learner for $Q$. We denote the set of all learnable families by LF.

7. (Approx. Learnable Family.) $Q \in$ TMF is approx. learnable if $\exists (L, p) \in$ PPT $\times\ \mathbb{P}$ s.t.

$$\forall k : \Pr[q \leftarrow \{0,1\}^k \cap \mathcal{K}_Q;\ a \leftarrow \mathcal{I}_{Q,k};\ X \leftarrow L^{Q^q}(1^{|q|}, Q) : X(a) = Q^q(a)] \ge 1/p(k)$$

(the probability taken over the coin tosses of $L$), and:

(a) $\forall a$ : if $Q^q(a)$ halts after $t$ steps then $X(a)$ halts after at most $p(t)$ steps.

(b) $|X| \le p(|q|)$.

We denote the set of all approx. learnable families by ALF.
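To make the notion of a learnable family concrete, the following sketch shows a learner for a toy family that is learnable in the sense above. The family (inner product over GF(2)) and all names (`Q`, `learner`, `k`) are illustrative assumptions, not taken from the paper. The learner uses only black-box access to $Q^q$ and outputs a program $X$ with $X = Q^q$.

```python
import secrets

def Q(q: int, a: int) -> int:
    """Toy family: Q^q(a) is the inner product of the bit strings q and a over GF(2)."""
    return bin(q & a).count("1") % 2

def learner(oracle, k: int):
    """Recover q with k black-box queries on unit vectors and return X with X = Q^q."""
    q_guess = 0
    for i in range(k):
        if oracle(1 << i):        # Q^q(e_i) equals bit i of q
            q_guess |= 1 << i
    return lambda a: Q(q_guess, a)

k = 12
q = secrets.randbits(k)                 # uniformly selected key
X = learner(lambda a: Q(q, a), k)       # the learner only sees the oracle, never q
```

Since this learner succeeds with probability 1 and its output has size and running time polynomial in $|q|$, the toy family lies in LF (and hence in ALF). Families of interest for white-box cryptography are those for which no such learner exists.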



Lemma 1. If $Q_1 \in$ PTMF\(A)LF, then the following holds:

$$\exists (Q_2, p) \in \mathrm{PTMF} \times \mathbb{P},\ \forall q_1 \in \mathcal{K}_{Q_1},\ \exists q_2 \in \mathcal{K}_{Q_2} : Q_1^{q_1} = Q_2^{q_2} \wedge |q_2| \le p(|q_1|) \implies Q_2 \notin \mathrm{(A)LF}.$$



The symbol (A) indicates that A is optional in the above statement.



Proof. Assume for contradiction that for any given $Q_1 \in$ PTMF that is not learnable, there exists
some $(Q_2, p) \in$ PTMF $\times\ \mathbb{P}$ such that the LHS of the above implication is satisfied but the RHS is
not. Let $L_1$ and $L_2$ be the learners for $Q_1$ and $Q_2$ respectively. $L_1$ runs $L_2$ using its own oracle
to answer $L_2$'s queries. If $Q_2$ is learnable, then $L_2$ will output $X_2 = Q_2^{q_2}$ in a number of steps
polynomial in $|q_2|$, which is polynomial in $|q_1|$ by assumption. Since $Q_2^{q_2} = Q_1^{q_1}$, $L_1$ thereby
learns $Q_1$, a contradiction. □



^1 This condition is to prevent an exponential time learner from becoming polynomial time by hard-wiring the learning
algorithm and queries/responses inside X.



4 Obfuscators 



In this work, we only consider obfuscation of PTMFs with a uniformly selected key, and not of 
a single PTM. As is common in cryptography, we define the functionality of the obfuscator using 
a correctness property and the security using a soundness property. In contrast to existing works, 
however, we define an obfuscator using only the correctness property. This is to consider different 
notions of "white-box" security (which might be unrelated to soundness) and still be able to use 
the word "obfuscator" in a formal sense. 

4.1 Obfuscator (Correctness) 

Definition 2. A randomized algorithm $O :$ PTMF $\times \{0,1\}^* \to$ TM satisfies correctness for $Q \in$
PTMF if the following two properties are satisfied:

1. Approx. functionality:

$$\forall q \in \mathcal{K}_Q, \forall a \in \mathcal{I}_{Q,|q|} : \Pr[O(Q,q)(a) \ne Q^q(a)] \le negl(|q|),$$

the probability taken over the coin tosses of $O$.^2

2. Polynomial slowdown and expansion: There exists $p \in \mathbb{P}$ s.t.

$$\forall q \in \mathcal{K}_Q : |O(Q,q)| \le p(|q|),$$

and $\forall a$, if $Q^q(a)$ halts in $t$ steps then $O(Q,q)(a)$ halts in at most $p(t)$ steps.

We say that $O$ is efficient if $O \in$ PPT.

If $O$ satisfies correctness for $Q$, we say that $O$ is an obfuscator for $Q$.
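The correctness property says nothing about security, which the following sketch tries to make tangible. It assumes a hypothetical toy family `Q` and a toy randomized compiler `obfuscate` that emits program text in which the key is merely split into two random XOR shares; this offers no protection at all, yet it is an "obfuscator" in the sense of the definition above because functionality and size are preserved.

```python
import secrets

def Q(q: int, a: int) -> int:
    """Toy keyed family (hypothetical): Q^q(a) = (a * q + q) mod 2^16."""
    return (a * q + q) % (1 << 16)

def obfuscate(q: int) -> str:
    """Toy randomized compiler: emits program text with q split into XOR shares.

    No security is claimed; only correctness (functionality and size) is illustrated.
    """
    r = secrets.randbits(16)
    return (
        "def prog(a):\n"
        f"    q = {q ^ r} ^ {r}\n"
        "    return (a * q + q) % (1 << 16)\n"
    )

def load(program_text: str):
    """Turn the emitted program text into a callable."""
    scope = {}
    exec(program_text, scope)
    return scope["prog"]

q = 0xBEEF
text = obfuscate(q)
prog = load(text)
```

For this toy compiler the output program agrees with $Q^q$ on every input (so the error probability is 0, not merely negligible), and its size is bounded by a constant plus the length of the two shares.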

4.2 Obfuscator (Soundness) 

Over recent years, several definitions of soundness have been proposed, all based on some sort of
Virtual Black-Box Property (VBBP) [2,20,21,22,24]. Let $Q \in$ PTMF and let $q \in \{0,1\}^*$. Loosely
speaking, the VBBP requires that whatever information about $q$ a PPT adversary computes given
the obfuscation $O(Q,q)$, a PPT simulator could also have computed using only black-box access
to $Q^q$. All existing notions of VBBP can be classified into one of two broad categories. At one
extreme (the weakest) are the predicate-based definitions, where the adversary and the simulator
are required to compute some predicate of $q$. At the other extreme (the strongest) are definitions
based on computational indistinguishability, where the simulator is required to output something
that is indistinguishable from $O(Q,q)$. We define these two notions below. Our definitions are based
on that of [13], where an auxiliary input is also considered.

Definition 3. An obfuscator $O$ for $Q \in$ PTMF satisfies soundness for $Q$ if at least one of the
properties given below is satisfied.

^2 For now, we consider the functionality of Q only in a deterministic sense. That is, we do not consider the notion of
obfuscation of "probabilistic functions" (used, for example, in [20,21]). However, our negative results (presented in
§6.1) also apply to probabilistic functions using an appropriately defined notion of probabilistic PTMFs (PPTMFs)
(and a corresponding notion of approx. functionality for PPTMFs). This aspect will be further discussed in §7.



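1. Predicate Virtual black-box property (PVBBP): Let $\pi$ be any efficiently verifiable predicate
on $\mathcal{K}_Q$. $O$ satisfies PVBBP for $Q$ if

$$\forall (A, p) \in \mathrm{PPT} \times \mathbb{P},\ \exists (S, k') \in \mathrm{PPT} \times \mathbb{N},\ \forall k > k' : \mathrm{Adv}^{pvbbp}_{A,S,O,Q}(k) \le negl(k),$$

where

$$\mathrm{Adv}^{pvbbp}_{A,S,O,Q}(k) = \max_{\pi} \max_{z \in \{0,1\}^{p(k)}} \left| \Pr[q \leftarrow \{0,1\}^k \cap \mathcal{K}_Q : A^{Q^q}(1^k, O(Q,q), z) = \pi(q)] - \Pr[q \leftarrow \{0,1\}^k \cap \mathcal{K}_Q : S^{Q^q}(1^k, z) = \pi(q)] \right|,$$

the probability taken over the coin tosses of $O$, $A$, $S$.

2. Computational Indistinguishability (IND): $O$ satisfies IND for $Q$ if

$$\forall (A, p) \in \mathrm{PPT} \times \mathbb{P},\ \exists (S, k') \in \mathrm{PPT} \times \mathbb{N},\ \forall k > k' : \mathrm{Adv}^{ind}_{A,S,O,Q}(k) \le negl(k),$$

where

$$\mathrm{Adv}^{ind}_{A,S,O,Q}(k) = \max_{z \in \{0,1\}^{p(k)}} \left| \Pr[q \leftarrow \{0,1\}^k \cap \mathcal{K}_Q : A^{Q^q}(1^k, O(Q,q), z) = 1] - \Pr[q \leftarrow \{0,1\}^k \cap \mathcal{K}_Q : A^{Q^q}(1^k, S^{Q^q}(1^k, z), z) = 1] \right|,$$

the probability taken over the coin tosses of $O$, $A$, $S$.

Depending on the property satisfied, we call it IND-soundness or PVBBP-soundness (note that the
former implies the latter).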

It has been noted (but never proved) in several papers (e.g., [2,21]) that the PVBBP is too
weak for practical purposes. Furthermore, it has been noted that the IND-soundness is too strong
to be satisfied in practice [21,24]. In fact, it is easy to prove:

Proposition 1. If there exists an obfuscator satisfying IND-soundness for some $Q \in$ PTMF then
$Q \in$ ALF.^4

Nevertheless, it is conceivable that a definition of soundness can be formulated falling somewhere
between the two extremes, which is neither too weak nor too strong, and can be used for proving
security of arbitrary white-box primitives. We show this is not the case. Specifically, we show that,
under whatever definition of soundness we use, for every family $Q \notin$ ALF, there exist (contrived)
specifications for which white-box security fails but the corresponding black-box construction is
secure.^5



5 White-box Cryptography (WBC) 

In this section, we formalize the notion of WBC by defining a white-box property (WBP). A key
concept of our model is the notion of a (cryptographic) specification. Informally, a specification is
a self-contained description (in some formal language) of a cryptographic scheme (such as
RSA-OAEP) along with a corresponding security notion (such as IND-CPA). We follow the basic
principles of various "game-based" approaches [3,5,14,15] where a security notion is captured using an
interactive game between an adversary and a challenger. In our model, the role of the challenger is
played by an experiment and the corresponding game is called a simulation. We denote by SPEC
the set of all specifications.

^3 The definition of PVBBP given here is slightly weaker than the one used in [2] because they require this property
to hold for every q, while we require it to hold only for uniformly selected q.

^4 This result does not hold if the definition of approx. functionality in correctness is extended to probabilistic
functions. See §7 for details.

^5 In related work, the authors of [20] show that a slightly different notion of the IND property - one based on
probabilistic functions - is insufficient for proving the white-box IND-CCA2-security of encryption schemes, even
if white-box IND-CPA is satisfied. Our results are more general because they apply to every $Q \notin$ ALF.



5.1 Black-Box Simulation 



Let spec $\in$ SPEC denote the specification of some scheme (e.g., "IND-CPA security notion for
symmetric encryption scheme X"). Every such spec defines a Black-box simulation (or simply
simulation) between an experiment and an adversary.

Experiment. The experiment for spec, written $Expt^{spec}$, is a TM having six tapes: (1) a
read-only experiment-input tape, (2) a writable adversary-input tape, (3) a read-only query-input
tape, (4) a writable query-response tape, (5) a read-only adversary-output tape, and (6) a writable
experiment-output tape.

Adversary. The adversary $A \in$ PPT is an algorithm having four tapes (along with an unknown
source of randomness via a random input tape): (1) a read-only adversary-input tape, (2) a writable
query-input tape, (3) a read-only query-response tape, and (4) a writable adversary-output tape.

Simulation. A simulation is an interactive protocol between the experiment and the adversary 
when their tapes coincide, and is started by invoking the experiment via the experiment-input tape. 

- The experiment-input tape contains two inputs: (1) a string of $k$ 1s, where $k$ is a security
parameter, and (2) a random string $r$ of $p_{in}(k)$ bits for some $p_{in} \in \mathbb{P}$.

- During the simulation, the experiment and the adversary interact using the common tapes. The
adversary terminates after writing a string on the adversary-output tape.

- The simulation ends when the experiment writes a result on the experiment-output tape.

- We require the result to be either 0 (indicating $A$ lost) or 1 ($A$ won).

- We denote by $Expt^{spec}_A$ the simulation, and by $Expt^{spec}_A(1^k, r)$ the result when the
experiment-input tape contains $(1^k, r)$.

- Every experiment must be based on the following template:

1. $Expt^{spec}_A(1^k, r)$:

2. /* Description of n families $Q_1, Q_2, \ldots, Q_n \in$ PTMF */

3. /* Description of PTM $f : \{0,1\}^{p_{in}(k)} \to \times_{i=1}^{n} \mathcal{K}_{Q_i}$ */

4. $(q_1, q_2, \ldots, q_n) \leftarrow f(r)$

5. $s \leftarrow A^{Q_1^{q_1}, Q_2^{q_2}, \ldots, Q_n^{q_n}}(1^k, spec)$

6. If (win($r$, QuerySet, $s$)) output 1 else output 0

The following discussion is based on the above template. 

• We do not allow the oracles used by $A$ to maintain state between successive queries^6 and
assume that a query takes one unit of time irrespective of the amount of computation involved.

• We require that at any instant $A$ can query at most one oracle.

• We require that if $r$ is uniformly distributed then so are the keys $q_i$ ($1 \le i \le n$).

• The run-time of $A$ is upper-bounded by $p_{run}(k)$ steps for some $p_{run} \in \mathbb{P}$ (specified in spec).

• QuerySet is a set representing the queries made by $A$ during the simulation. Each element
$j$ of this set is an ordered tuple of the type

$$(t_j, i_j, in_j, out_j) \in \mathbb{N} \times \{1, 2, \ldots, n\} \times \{0,1\}^* \times \{0,1\}^*,$$

indicating respectively, the time, oracle number, input, and the output of each query.

• win is (the PTM description of) an efficiently computable predicate on ($r$, QuerySet, $s$).

• We say that a family $Q \in$ spec if $Q \in \{Q_i\}_{1 \le i \le n}$.
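The experiment template can be mirrored in a few lines of code. The sketch below is an illustration under stated assumptions: the function `run_experiment`, the byte-string encoding of keys and queries, the use of a query counter as the "time" $t_j$, and the deliberately insecure toy specification ("output the key") are all hypothetical and chosen only to make steps 4 to 6 and the bookkeeping of QuerySet explicit.

```python
from typing import List, Tuple

Query = Tuple[int, int, bytes, bytes]   # (time t_j, oracle number i_j, input, output)

def run_experiment(families, f, adversary, win, k, r):
    """Steps 4-6 of the template; the query counter stands in for time."""
    keys = f(r)                                   # step 4: derive the keys from r
    query_set: List[Query] = []

    def make_oracle(i):
        def oracle(x: bytes) -> bytes:
            out = families[i](keys[i], x)         # stateless oracle Q_i^{q_i}
            query_set.append((len(query_set) + 1, i + 1, x, out))
            return out
        return oracle

    oracles = [make_oracle(i) for i in range(len(families))]
    s = adversary(oracles, k)                     # step 5: run A with oracle access
    return 1 if win(r, query_set, s) else 0       # step 6: evaluate the win predicate

# Toy, deliberately insecure specification (hypothetical): "output the key".
def xor_family(q: bytes, x: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(q, x))

def f(r: bytes):
    return [r]                                    # one family, key q_1 = r

def win(r, query_set, s):
    return s == f(r)[0]

def key_recovery_adversary(oracles, k):
    return oracles[0](bytes(k))                   # the all-zero query leaks q_1

def idle_adversary(oracles, k):
    return b""
```

In this toy game the win predicate ignores QuerySet altogether; specifications of real security notions typically inspect it, for example to rule out trivial wins.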

^6 If state is to be maintained, for instance, each response to the query must use different randomness (and so a query
counter must be maintained), then we first assume that the adversary can make at most x queries to this oracle, and
we replicate the oracle x times, each with different randomness. In the winning condition, we test that each such
oracle was queried at most one time.



Definition 4. We define

$$Adv^{spec}_A(k) = \Pr[r \leftarrow \{0,1\}^{p_{in}(k)} : Expt^{spec}_A(1^k, r) = 1],$$

the probability taken over the coin tosses of $A$.

Definition 5. (Obfuscatable family) For any PTMF $Q_i \in$ spec, define

$$QuerySet_i = \{(t_j, i_j, in_j, out_j) \mid (t_j, i_j, in_j, out_j) \in QuerySet \wedge i_j \ne i\}.$$

We say that $Q_i$ is obfuscatable in spec (written $Q_i \in_{obf}$ spec) if

$$\forall r, QuerySet, s : win(r, QuerySet, s) = win(r, QuerySet_i, s).$$

(In other words, $Q_i \in$ spec is obfuscatable if every element of QuerySet corresponding to oracle
$Q_i^{q_i}$ can be removed without affecting the win predicate).
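The obfuscatability condition can be spot-checked on sample transcripts. The following sketch uses hypothetical helper names (`query_set_minus`, `looks_obfuscatable`) and two toy win predicates; it tests the condition only on the supplied samples, whereas the definition quantifies over all $(r, QuerySet, s)$.

```python
def query_set_minus(query_set, i):
    """QuerySet_i: drop every query that was made to oracle number i."""
    return [(t, ij, x, out) for (t, ij, x, out) in query_set if ij != i]

def looks_obfuscatable(win, i, samples):
    """Finite sanity check of the condition on given (r, QuerySet, s) samples."""
    return all(
        win(r, qs, s) == win(r, query_set_minus(qs, i), s)
        for (r, qs, s) in samples
    )

# A win predicate that ignores the queries to oracle 1.
def win_ignores_queries(r, qs, s):
    return s == r

# A win predicate that limits the number of queries to oracle 1.
def win_counts_queries(r, qs, s):
    return s == r and sum(1 for q in qs if q[1] == 1) <= 1

samples = [(b"k", [(1, 1, b"a", b"b"), (2, 1, b"c", b"d")], b"k")]
ok = looks_obfuscatable(win_ignores_queries, 1, samples)
not_ok = looks_obfuscatable(win_counts_queries, 1, samples)
```

The second predicate fails the check because its outcome changes once the two recorded queries to oracle 1 are removed, which is exactly the situation in which the corresponding family cannot meaningfully be handed to the adversary as a program.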

Remark 1. We claim that it is meaningless to talk about white-box security of specifications where
the PTMF to be white-boxed is not obfuscatable, since it is impossible to keep track of "queries"
made by an adversary to an obfuscated program. As an example, it is meaningless to talk about
obfuscating the decryption oracle of an encryption scheme (or the 'signing' oracle of a MAC scheme).

An example of a specification for the IND-CCA2 notion of some symmetric encryption scheme
is given in Appendix A.



5.2 White-box Simulation 

Let $Expt^{spec}_A$ capture the security of some spec $\in$ SPEC (using the template of §5.1). Let $O$ be an
obfuscator for some $Q_i$ with $1 \le i \le n$ such that $Q_i \in_{obf}$ spec. Define the corresponding white-box
experiment for (spec, $Q_i$) as follows:

1. $ExptWB^{spec}_{A,O,Q_i}(1^k, r)$:

2. /* Description of n families $Q_1, Q_2, \ldots, Q_n \in$ PTMF */

3. /* Description of PPT $f : \{0,1\}^{p_{in}(k)} \to \times_{i=1}^{n} \mathcal{K}_{Q_i}$ */

4. $(q_1, q_2, \ldots, q_n) \leftarrow f(r)$

5. $s \leftarrow A^{Q_1^{q_1}, Q_2^{q_2}, \ldots, Q_n^{q_n}}(1^k, spec, i, O(Q_i, q_i))$

6. If (win($r$, QuerySet, $s$)) output 1 else output 0

- As before, we bound the running time of $ExptWB^{spec}_{A,O,Q_i}$ to $p_{run}(k)$ steps.

Definition 6. We define

$$Advwb^{spec}_{A,O,Q_i}(k) = \Pr[r \leftarrow \{0,1\}^{p_{in}(k)} : ExptWB^{spec}_{A,O,Q_i}(1^k, r) = 1],$$

the probability taken over the coin tosses of $A$, $O$.

Definition 7. (White-box Property (WBP)) Let $O$ be an obfuscator for $Q_i \in$ PTMF and let
spec be such that $Q_i \in_{obf}$ spec. We say that $O$ satisfies WBP for ($Q_i$, spec) if the following holds:

$$\min_{A \in \mathrm{PPT}} \left| Advwb^{spec}_{A,O,Q_i}(k) - Adv^{spec}_A(k) \right| \le negl(|k|).$$

The term $\min_{A \in \mathrm{PPT}} \left| Advwb^{spec}_{A,O,Q_i}(k) - Adv^{spec}_A(k) \right|$ is called the white-box advantage of $A$ w.r.t.
($O$, $Q_i$, spec), and serves as a measure of "useful information leakage" by the obfuscation.

Definition 8. (Universal White-box Property (UWBP)) Let $O$ be an obfuscator for $Q \in$
PTMF. We say that $O$ satisfies UWBP for $Q$ if for every spec $\in$ SPEC with $Q \in_{obf}$ spec, $O$
satisfies the white-box property for ($Q$, spec).



6 WBC and Obfuscation 



In this section we give some useful relationships between obfuscators, WBP and UWBP. 
6.1 Negative results 

We note that Barak et al.'s impossibility results [2] also apply to our definitions. In our model, their
results can be interpreted as the following:

There exists a pair ($Q$, spec) $\in$ PTMF $\times$ SPEC with $Q \in_{obf}$ spec such that every obfuscator for
$Q$ fails to satisfy WBP for ($Q$, spec).

In other words, there cannot exist an obfuscator that satisfies UWBP for every $Q$. However, their
results do not rule out an obfuscator that satisfies the UWBP for some useful family $Q$. We show
that even this is not possible unless $Q$ is at least approx. learnable.

Result 1: No UWBP For "Interesting" Families. (Informal) Obfuscators satisfying UWBP
for "interesting" families do not exist. More formally:

Theorem 1. For every family $Q \in$ PTMF\ALF, there exists a (contrived) spec $\in$ SPEC such that
$Q \in_{obf}$ spec but every obfuscator for $Q$ fails to satisfy the WBP for ($Q$, spec).

Proof. Let $Q \in$ PTMF\ALF. Consider spec = find-q' captured below.

1. $Expt^{find\text{-}q'}_A(1^k, r := (q, q', a))$:

2. Family $Q$(Key $q$, Input $X$) {

3. /* Description of $Q$ (used black-box) */

4. }

5. Family $Q_1$(Key $q_1 := (q, q', a)$, Input $Y$) {

6. /* We assume $Y \in$ PTM */

7. If ($Y(a) = Q^q(a)$) output $q'$ else output 0

8. /* In the above, (the description of) $Q$ is used as a black-box */

9. /* $Y$ is allowed to run for at most $\hat{p}(|a|)$ steps (for some $\hat{p} \in \mathbb{P}$) */

10. }

11. Function $f$(Input $r$) {

12. Parse $r$ as $(q, q', a)$

13. /* we require that $a \in \mathcal{I}_{Q,|q|} \wedge |q'| = |q|$ */

14. set $q_1 \leftarrow (q, q', a)$

15. output $q, q_1$

16. }

17. $q, q_1 \leftarrow f(r)$

18. $s \leftarrow A^{Q^q, Q_1^{q_1}}(1^k, \text{find-}q')$

19. If ($s = q'$ and at most one query to $Q_1^{q_1}$) output 1 else output 0



Observe that $Q \in_{obf}$ find-q'. Since $Q \notin$ ALF, by virtue of Definitions 1.7 and 2.1, for
sufficiently large random $q, q', a$ (and thus, $k$), the following inequalities are guaranteed to hold:

$$\forall A \in \mathrm{PPT} : 0 \le Adv^{find\text{-}q'}_A(k) \le \alpha(k)$$

$$\exists A \in \mathrm{PPT} : 1 \ge Advwb^{find\text{-}q'}_{A,O,Q}(k) \ge 1 - \beta(k),$$

where $\alpha, \beta$ are negligible functions. Hence, we have:

$$\min_{A \in \mathrm{PPT}} \left| Advwb^{find\text{-}q'}_{A,O,Q}(k) - Adv^{find\text{-}q'}_A(k) \right| \ge 1 - \alpha(k) - \beta(k),$$

which is non-negligible in $k$. This proves the theorem. □
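The intuition behind the find-q' specification can be sketched as follows. Everything here is an illustrative assumption: HMAC-SHA256 merely stands in for a family assumed not to be approx. learnable, `obfuscate` is a placeholder for any obfuscator (only its correctness is used), and programs $Y$ are modelled as Python callables. The sketch does not enforce that $Y$ is a self-contained PTM, so it cannot stop a black-box adversary from wrapping its oracle in a callable, which the formal model rules out.

```python
import hashlib
import hmac
import secrets

def Q(q: bytes, a: bytes) -> bytes:
    """Stand-in for a family assumed not approx. learnable (HMAC-SHA256; an assumption)."""
    return hmac.new(q, a, hashlib.sha256).digest()

def obfuscate(q: bytes):
    """Placeholder for ANY obfuscator: only correctness is used, O(Q,q)(a) = Q^q(a)."""
    return lambda a: Q(q, a)

def find_q_prime(adversary, white_box: bool, k: int = 16) -> int:
    """Toy version of the find-q' experiment (black-box or white-box variant)."""
    q, q_prime, a = (secrets.token_bytes(k) for _ in range(3))
    queries_to_q1 = 0

    def oracle_q(x: bytes) -> bytes:          # oracle Q^q
        return Q(q, x)

    def oracle_q1(Y) -> bytes:                # oracle Q_1^{q_1}; Y is a program
        nonlocal queries_to_q1
        queries_to_q1 += 1
        return q_prime if Y(a) == Q(q, a) else b"\x00"

    extra = obfuscate(q) if white_box else None
    s = adversary(oracle_q, oracle_q1, extra)
    return 1 if (s == q_prime and queries_to_q1 <= 1) else 0

def white_box_adversary(oracle_q, oracle_q1, obf):
    return oracle_q1(obf)                     # submit the obfuscated program itself as Y

def black_box_guesser(oracle_q, oracle_q1, _):
    return oracle_q1(lambda a: b"guess")      # a program that does not compute Q^q
```

The white-box adversary wins with a single query because the obfuscated program, whatever it looks like, agrees with $Q^q$ on the hidden input $a$. This is the "useful" non-black-box information that no obfuscator can withhold while remaining correct.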

Remark 2. The above result applies because the approx. functionality requirement for obfuscators
(in the correctness definition of §4.1) is defined only for PTMFs considered as deterministic. What
about the extended definitions (such as in [20,21]) which allow probabilistic families? It turns
out that a similar technique can be used for probabilistic families using appropriately extended
definitions. This aspect is discussed in §7.

Remark 3. Although we define ALF to be the set of families which can be approximately learned
with a non-negligible advantage (which is quite broad), we note that the above result can be further
strengthened by narrowing down the definition of ALF to only families that can be approximately
learned with an overwhelming advantage.

Our next result deals with multiple obfuscations. 

Result 2: Simultaneous Obfuscation May Be Insecure. (Informal) Simultaneous obfuscation 
of two families may be insecure even if obfuscation of each family alone is secure. We give a definition 
before stating this formally. 

Definition 9. (Multiple obfuscations) Let spec $\in$ SPEC be the specification defined by $Expt^{spec}_A$
using the above template. Let $Q_i, Q_j \in_{obf}$ spec for some $1 \le i, j \le n$. Let $O$ be an obfuscator for
$Q_i, Q_j$. Extend the white-box simulation $ExptWB^{spec}_{A,O,Q_i}$ of §5.2 by defining a corresponding simulation
$ExptWB^{spec}_{A,O,(Q_i,Q_j)}$ in which $A$ gets as input in Step 5 the tuple $(1^k, i, j, O(Q_i, q_i), O(Q_j, q_j))$. Finally
define,

$$Advwb^{spec}_{A,O,(Q_i,Q_j)}(k) = \Pr[r \leftarrow \{0,1\}^{p_{in}(k)} : ExptWB^{spec}_{A,O,(Q_i,Q_j)}(1^k, r) = 1],$$

the probability taken over the coin tosses of $A$, $O$.

We say that $O$ satisfies WBP for (($Q_i, Q_j$), spec) if the following holds:

$$\min_{A \in \mathrm{PPT}} \left| Advwb^{spec}_{A,O,(Q_i,Q_j)}(k) - Adv^{spec}_A(k) \right| \le negl(|k|).$$



Theorem 2. Let $Q_i, Q_j \in$ PTMF\ALF. Then there exists a spec $\in$ SPEC with $Q_i, Q_j \in_{obf}$ spec
such that even if there exists an obfuscator for $Q_i, Q_j$ satisfying WBP for ($Q_i$, spec) and ($Q_j$, spec),
every obfuscator fails to satisfy WBP for (($Q_i, Q_j$), spec).

The proof is similar to the proof of Theorem 1.



6.2 Positive Results 

Although the above results rule out the possibility of obfuscators satisfying UWBP for most
non-trivial families, they do not imply that a meaningful definition of security for white-box
cryptography cannot exist. In fact, any asymmetric encryption scheme can be considered as a white-boxed
version of the corresponding symmetric scheme (where the encryption key is also secret). We use
this observation as a starting point of our first positive result. A similar observation was used in
the positive results of [20].



Result 3: WBP For "Useful" Families. (Informal) There exists a non-approx. learnable family
(in fact many), and an obfuscator that satisfies WBP for that family under some useful specification.
This is stated formally in Theorem 3.

Theorem 3. Under standard computational assumptions, there exists a pair ($Q$, spec) $\in$ (PTMF\ALF) $\times$
SPEC with $Q \in_{obf}$ spec and an efficient obfuscator $O$ for $Q$ satisfying WBP for ($Q$, spec).

Proof. We prove this by construction. We will use an encryption scheme based on the BF-IBE
scheme [7]. First we describe a primitive known as a bilinear pairing. Let $G_1$ and $G_2$ be two
cyclic multiplicative groups both of prime order $w$ such that computing discrete logarithms in $G_1$
and $G_2$ is intractable. A bilinear pairing is a map $e : G_1 \times G_1 \to G_2$ that satisfies the following
properties [7,8,9].

1. Bilinearity: $e(a^x, b^y) = e(a, b)^{xy}\ \forall a, b \in G_1$ and $x, y \in \mathbb{Z}_w$.

2. Non-degeneracy: If $g$ is a generator of $G_1$ then $e(g, g)$ is a generator of $G_2$.

3. Computability: The map $e$ is efficiently computable.

Define a symmetric encryption scheme $\mathcal{E} = (G, E, D)$ as follows.

1. Key Generation (G): Let $e : G_1 \times G_1 \to G_2$ be a bilinear pairing over cyclic multiplicative
groups as defined above (such maps are known to exist). Let $|G_1| = |G_2| = w$ (prime) such that
$\lfloor \log_2(w) \rfloor = l$. Pick random $g \leftarrow G_1$ and define $\mathcal{H} : G_2 \to \{0,1\}^l$ to be a hash function.
Finally pick $x \leftarrow G_1$ and define $key = (e, G_1, G_2, w, g, \mathcal{H}, x)$. The encryption/decryption key is
$key$.

2. Encryption (E): The encryption family $E$ using key $key$ is defined as follows. Parse $key$ as
$(e, G_1, G_2, w, g, \mathcal{H}, x)$. Let $m \in \{0,1\}^l$ be a message and $a \in \mathbb{Z}_w$ be a random string. Set
$(c_1, c_2) \leftarrow E^{key}(m, a)$, where:

$$E^{key} : \{0,1\}^l \times \mathbb{Z}_w \ni (m, a) \mapsto (\mathcal{H}(e(x^a, g)) \oplus m,\ g^a) \in \{0,1\}^l \times G_1.$$

3. Decryption (D): The decryption family $D$ using key $key$ is defined as follows. Parse $key$ as
$(e, G_1, G_2, w, g, \mathcal{H}, x)$ and compute $m = D^{key}(c_1, c_2)$, where:

$$D^{key} : \{0,1\}^l \times G_1 \ni (c_1, c_2) \mapsto \mathcal{H}(e(c_2, x)) \oplus c_1 \in \{0,1\}^l.$$

It can be verified that $D^{key}(E^{key}(m, a)) = m$ for valid values of $(m, a)$.

The scheme can be proven to be CPA secure if $\mathcal{H}$ is a random oracle and $w$ is sufficiently large.
We construct an obfuscation of the $E^{key}$ oracle that converts $\mathcal{E}$ into a CPA secure asymmetric
encryption scheme under a computational assumption.

The obfuscator O: The input is ($E$, $key$).

1. Parse $key$ as $(e, G_1, G_2, w, g, \mathcal{H}, x)$ and set $y \leftarrow e(x, g) \in G_2$.

2. Set $key' \leftarrow (e, G_1, G_2, w, g, \mathcal{H}, y)$ and define family $F$ with key $key'$ as:

$$F^{key'} : \{0,1\}^l \times \mathbb{Z}_w \ni (m, a) \mapsto (\mathcal{H}(y^a) \oplus m,\ g^a) \in \{0,1\}^l \times G_1,$$

where $key'$ is parsed as $(e, G_1, G_2, w, g, \mathcal{H}, y)$.

3. Output $F^{key'}$.
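The functional equivalence $F^{key'} = E^{key}$ and the decryption identity can be checked on a toy instance. The sketch below is insecure by construction and rests on several assumptions made only for illustration: elements of $G_1$ are stored as their exponents (so discrete logarithms are trivial), $G_2$ is the order-1019 subgroup of $\mathbb{Z}_{2039}^*$, and truncated SHA-256 stands in for $\mathcal{H}$. It illustrates the algebra only, not the security of the construction.

```python
import hashlib
import secrets

W = 1019                 # toy prime group order w
P = 2 * W + 1            # 2039, prime; G2 is the order-W subgroup of Z_P^*
GT = 4                   # generator of G2
L = W.bit_length() - 1   # l = floor(log2(w))

def pairing(u: int, v: int) -> int:
    """Toy bilinear map: G1 elements are stored as exponents, so e(u, v) = GT^(u*v)."""
    return pow(GT, (u * v) % W, P)

def H(z: int) -> int:
    """Hash G2 -> {0,1}^l (SHA-256 reduced to l bits; stand-in for the random oracle)."""
    return int.from_bytes(hashlib.sha256(str(z).encode()).digest(), "big") % (1 << L)

def keygen():
    g = 1 + secrets.randbelow(W - 1)     # nonzero exponent, hence a generator of G1
    x = secrets.randbelow(W)
    return g, x

def E(key, m: int, a: int):
    g, x = key
    # (H(e(x^a, g)) xor m, g^a); exponentiation in G1 is multiplication of exponents
    return H(pairing(a * x % W, g)) ^ m, a * g % W

def D(key, c1: int, c2: int) -> int:
    g, x = key
    return H(pairing(c2, x)) ^ c1

def obfuscate(key):
    g, x = key
    y = pairing(x, g)                    # key' carries y = e(x, g) instead of x
    return lambda m, a: (H(pow(y, a, P)) ^ m, a * g % W)

key = keygen()
F = obfuscate(key)
```

The program `F` encrypts without ever touching $x$, which is the sense in which the obfuscated encryption oracle behaves like a public key, while decryption still requires $x$.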

Claim. $O$ is an efficient obfuscator for $E$ satisfying WBP for ($E$, spec), where spec = "IND-CPA
security of $\mathcal{E}$", assuming that the bilinear Diffie-Hellman assumption [7] holds in ($G_1$, $G_2$) and $\mathcal{H}$
can be considered equivalent to a random oracle.



Proof. We refer the reader to Appendix A for the formal definition of IND-CPA security. (The
IND-CPA game is a restricted version of the IND-CCA2 game given there, obtained by adding the additional check
"no queries to $D^{key}$" to the win predicate.)

First note that the obfuscator satisfies correctness for $E$ because $F^{key'} = E^{key}$. The rest of
the above claim follows from the security of the BasicPUB encryption scheme of [7].

□

Claim. If $\mathcal{H}$ is a one-way hash function then $E \in \mathrm{PTMF} \setminus \mathrm{ALF}$.

Proof. Clearly, $F \in \mathrm{PTMF}$ and the following holds:

$\exists p \in P, \forall key \in \mathcal{K}_E, \exists key' \in \mathcal{K}_F : F^{key'} = E^{key} \wedge |key'| = |key| + p(|key|).$

By virtue of Lemma [U, in order to prove that $E \notin \mathrm{ALF}$, it is sufficient to prove that $F \notin \mathrm{ALF}$.
Finally, it can be proved that if $\mathcal{H}$ is a one-way hash function then indeed $F \notin \mathrm{ALF}$. □

This completes the proof of Theorem 3. □

Remark 4. An interesting observation from Theorem 3 is that even though the obfuscator O satisfies
WBP for $(E, spec)$, it does not satisfy soundness for $E$ (under Definition 3). This indicates that
the soundness property and WBP are in general independent of each other.

Remark 5. A reader might wonder why we used the specific encryption scheme in the proof of Theorem 3
when we could have used just about any asymmetric scheme (such as RSA), or even the
re-encryption scheme of [21]. We justify our choice with the following reasons:

1. Why not RSA, El Gamal, etc? 

(a) Textbook RSA does not enjoy the security notion of IND-CPA. Furthermore, even in RSA 
variants that are IND-CPA, it is impossible to prove $E \notin \mathrm{ALF}$ without relying on additional
computational assumptions. 

(b) Encryption in El Gamal (and its variants) is learnable. 

2. Why not the re-encryption scheme of [21]?

The obfuscator of [21] does not satisfy approx. functionality as we define, and so their 
scheme is unsuitable for the proof. (However, the scheme of [21] is the ideal candidate for 
an analogous example of ^) 

6.3 UWBP For Non-Trivial Families

Let $Q \in \mathrm{PTMF} \cap \mathrm{LF}$. Then it is easy to construct an obfuscator satisfying UWBP for $Q$ with a
non-negligible probability (same as that of learning $Q$). We call such families trivial.

Although Result 1 rules out the possibility of an obfuscator satisfying UWBP for some
$Q \in \mathrm{PTMF} \setminus \mathrm{ALF}$ (which includes most non-trivial families), it does not rule out the possibility of an
obfuscator satisfying UWBP for some non-trivial family $Q \in \mathrm{PTMF} \cap \mathrm{ALF}$ (i.e., $Q \in \mathrm{PTMF} \cap \mathrm{ALF} \setminus \mathrm{LF}$).
Our next positive result shows that, under reasonable assumptions, this is indeed the
case.



Note that for proving IND-CPA security, we need a stronger assumption on $\mathcal{H}$, namely that it is equivalent to a
random oracle. However, for proving that $E \notin \mathrm{ALF}$, the assumption that $\mathcal{H}$ is a one-way hash function is sufficient.



Result 4: UWBP for a non-trivial family (informal). There exists an obfuscator satisfying
UWBP for a non-trivial (but contrived) family $Q$. Formally,

Theorem 4. Under reasonable assumptions, there exists a family $Q \in \mathrm{PTMF} \cap \mathrm{ALF} \setminus \mathrm{LF}$ and an
obfuscator O for $Q$ that satisfies UWBP for $Q$.

Proof. For simplicity, we prove the above result in the random oracle model. Then under the 
assumption that there exist hash functions equivalent to random oracles, our result can be lifted 
to the plain model. 

Consider the family Q defined below: 

Family Q(Key q, Input X) {
    If Random-Oracle_{|q|}(q || X) = q output 1 else output 0
}

Here, Random-Oracle$_{|q|}$ is a random oracle mapping arbitrary strings to $|q|$-bit strings. First note
that indeed $Q \in \mathrm{PTMF} \cap \mathrm{ALF} \setminus \mathrm{LF}$. It can be proved that $\forall D \in \mathrm{PPT}$ (the distinguisher),

$$\left| \Pr\left[ b \leftarrow \{0,1\},\ q_0, q_1 \leftarrow \{0,1\}^k \cap \mathcal{K}_Q : D^{Q^{q_b}}(1^k, q_0, q_1) = b \right] - \frac{1}{2} \right| \le negl(k), \qquad (1)$$

the probability taken over the coin tosses of $D$. For any $k$, let $q \leftarrow \{0,1\}^k \cap \mathcal{K}_Q$. Consider an
obfuscator O that takes as input $(Q, q)$ and simply outputs a description of $Q^q$ as the obfuscation
of $Q^q$. Let $spec \in \mathrm{SPEC}$ be such that $Q \in_{obf} spec$ but O does not satisfy WBP for $(Q, spec)$ w.r.t.
some adversary $A \in \mathrm{PPT}$. If $A$ has a non-negligible white-box advantage w.r.t. $(O, Q, spec)$, then
$A$ can be directly converted into a distinguisher $D$ such that Equation (1) does not hold, thereby
arriving at a contradiction. □
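The family Q and the trivial obfuscator used in this proof can be sketched in Python. This is an illustration under stated assumptions, not the construction itself: SHAKE-256 stands in for the random oracle (a concrete hash function is not a random oracle), keys and inputs are byte strings rather than bit strings, and the names `random_oracle`, `Q` and `obfuscate` are hypothetical.

```python
import hashlib

def random_oracle(data: bytes, out_len: int) -> bytes:
    """Stand-in for Random-Oracle_{|q|}: maps any string to out_len bytes.
    Assumption: SHAKE-256 is used purely for illustration."""
    return hashlib.shake_256(data).digest(out_len)

def Q(q: bytes, x: bytes) -> int:
    """Q^q(X) = 1 if Random-Oracle_{|q|}(q || X) = q, else 0."""
    return 1 if random_oracle(q + x, len(q)) == q else 0

def obfuscate(q: bytes):
    """Trivial obfuscator: output a description of Q^q with q in the clear.
    Here the 'description' is a closure over q."""
    return lambda x: Q(q, x)
```

On every input one can feasibly try, `obfuscate(q)` outputs 0, since an input on which it outputs 1 would be a preimage of `q` under the oracle with prefix `q`. This matches the intuition that the all-zero program approximates Q^q, while recovering Q^q exactly is a different matter.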



7 The Case Of Probabilistic PTMFs 



In this section, we consider probabilistic (i.e., randomized) functions based on the definitions
of [20,21]. In contrast to conventional constructions of PTMFs (such as the encryption algorithm of
the probabilistic encryption scheme in the proof of Theorem 3 and Appendix A), where randomness
is considered as part of the input tape, the definitions of [20,21] consider randomness as part of the
key tape.* We call a PTMF of the latter type a probabilistic PTMF (PPTMF).

Intuitively, a PPTMF is simply an ordinary PTMF Q with part of the key used for randomness, 
so that two different keys are "equivalent" provided only their random bits are different. 

Formally, a PPTMF is any pair of the type $(Q, \tau)$, where $Q \in \mathrm{PTMF}$ and $\tau$ is an equivalence
relation on $\mathcal{K}_Q$ that partitions $\mathcal{K}_Q$ into equivalence classes, s.t.

$\forall q_1, q_2 \in \mathcal{K}_Q : \tau(q_1, q_2) = 1 \iff$ only the random bits of $q_1, q_2$ are different.

We denote the set of all PPTMFs by PPTMF.

Definition 10. In the following, let $(Q, \tau) \in \mathrm{PPTMF}$.

1. Let $q \in \mathcal{K}_Q$. Then:

* In the construction of [21], the PTMF has additional randomness on the input tape (a.k.a. 're-randomization
values', which are supplied by the adversary). We ignore this additional randomness in our discussion (the adversary
cannot be trusted to supply randomness), since we are focusing on the security of the obfuscator of some given
PTMF, and not of the specification in which the PTMF is used.



— For any $(a, z) \in I_{Q,|q|} \times \{0,1\}^*$, we say that $z$ is $\tau$-equal to $Q^q(a)$ (written $z =_\tau Q^q(a)$) if

$\exists q' \in \mathcal{K}_Q : z = Q^{q'}(a) \wedge \tau(q, q') = 1.$

— For any $X \in \mathrm{TM}$ we say that $X$ is $\tau$-equal to $Q^q$ (written $X =_\tau Q^q$) if

$\forall a \in I_{Q,|q|} : X(a) =_\tau Q^q(a).$

2. We define a $\tau$-(approx.) learnable family by replacing "$=$" with "$=_\tau$" in the definition of
(approx.) learnable families. The following claim is easy to prove.

Claim. If $Q$ is not $\tau$-approx. learnable then $Q \notin \mathrm{ALF}$.

3. Let $O : \mathrm{PTMF} \times \{0,1\}^* \to \mathrm{TM}$ be a randomized algorithm. Then:

— O satisfies $\tau$-approx. functionality for $Q$ if

$\forall q \in \mathcal{K}_Q, \forall a \in I_{Q,|q|} : \Pr[O(Q, q)(a) \neq_\tau Q^q(a)] \le negl(|q|),$

the probability taken over the coin tosses of O.

— O satisfies $\tau$-correctness for $Q$ if DefinitionlM (of ^4-i\ ) holds when "approx. functionality"
is replaced by "$\tau$-approx. functionality".

— O is a $\tau$-obfuscator for $Q$ if it satisfies $\tau$-correctness for $Q$.

4. $Q$ is $\tau$-decidable if there exist $p, p' \in P$ and for every $k \in \mathbb{N}$, there exists an efficiently computable
map

$f_k : \{0,1\}^{p(k)} \to (\{0,1\}^k \cap \mathcal{K}_Q) \times \mathrm{PTM},$

such that for all $(q, Z) \leftarrow f_k(r)$, the following holds:

— $\forall a, z \in \{0,1\}^* : Z(a, z) = 1 \iff z =_\tau Q^q(a)$.

— If $r$ is uniformly distributed then so is $q$.

— $|Z| \le p'(k)$.
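To make the notions of $\tau$-equality and $\tau$-decidability concrete, here is a small Python sketch. Everything in it is an assumption chosen for illustration and does not come from the constructions of [20,21]: keys are pairs `(k, r)` of a secret part and random bits, the toy family outputs `(r, PRF_k(r) XOR a)` with HMAC-SHA256 standing in for a pseudorandom function, and the names `Q`, `tau` and `make_decider` are hypothetical.

```python
import hashlib
import hmac

R_LEN = 16  # length of the random part of a key, in bytes (assumed)

def Q(q, a):
    """Toy Q^q(a) for q = (k, r): output (r, PRF_k(r) XOR a), |a| <= 32 bytes."""
    k, r = q
    if len(a) > 32:
        raise ValueError("toy family handles inputs of at most 32 bytes")
    pad = hmac.new(k, r, hashlib.sha256).digest()[:len(a)]
    return r, bytes(x ^ y for x, y in zip(pad, a))

def tau(q1, q2):
    """tau(q1, q2) = 1 iff the two keys differ at most in their random bits."""
    return q1[0] == q2[0]

def make_decider(q):
    """Return Z with Z(a, z) = 1 iff z is tau-equal to Q^q(a).
    Z embeds the secret part k but not the random bits of q."""
    k, _ = q
    def Z(a, z):
        r2, c = z
        # z is tau-equal to Q^q(a) iff some key (k, r2) maps a to z.
        return len(r2) == R_LEN and Q((k, r2), a) == (r2, c)
    return Z
```

Two keys with the same `k` but different `r` give different outputs on the same input, yet the decider built from either key accepts both outputs. An output produced under a different `k` is rejected except with the negligible probability that the two pads coincide.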

We claim that in any meaningful PPTMF construction $(Q, \tau)$, the family $Q$ must be $\tau$-decidable
(this is true for the constructions of [20,21]). Theorem 5 below is the equivalent of Theorem 1 for
PPTMFs. The proof follows directly after replacing "$=$" with "$=_\tau$" in the proof of Theorem 1.

Theorem 5. For every $(Q, \tau) \in \mathrm{PPTMF}$ with $Q$ $\tau$-decidable but not $\tau$-approx. learnable, there
exists $spec \in \mathrm{SPEC}$ such that $Q \in_{obf} spec$ but every $\tau$-obfuscator for $Q$ fails to satisfy the WBP for
$(Q, spec)$.

7.1 An Open Question: WBP and Soundness 

Let $((Q, \tau), spec) \in \mathrm{PPTMF} \times \mathrm{SPEC}$ such that the following is true:

1. $Q$ is not $\tau$-approx. learnable.

2. $Q \in_{obf} spec$.

3. $Q$ is $\tau$-decidable.

4. O is a $\tau$-obfuscator for $Q$ satisfying IND-soundness (§4.2, Definition 3).



A useful question is: Given $spec$, can we decide if O satisfies WBP for $(Q, spec)$?

Why is it useful? Ideally, we would like the WBP to be satisfied. However, WBP is defined
w.r.t. a family and a specification while soundness is defined w.r.t. a family, independent of the
specification. On the one hand, this simpler definition may make soundness appealing to
obfuscator designers. On the other hand, IND-soundness may be too strong to be satisfied
even though WBP is (w.r.t. some specification), as in the example in the proof of Theorem 3.
Nevertheless, we consider it an interesting question to characterize cases when the WBP can be
reduced to IND-soundness. The assumption that O is a $\tau$-obfuscator (rather than an obfuscator)
for $Q$ is necessary to falsify Proposition [H which rules out interesting families.

An Open Question: If spec is such that the only oracle available to A is the one being obfuscated, 
then the WBP for spec indeed holds if IND-soundness holds. The result also holds if spec has 
additional oracles that always output the same string (which can be given as an auxiliary input to 
the distinguisher - cf. "distinguishable attack property" of [H]). At this stage, an open question is: 
how to characterize specs where A is given additional oracles which output query-dependent strings?

8 Conclusion 

In this work, we initiated a formal study of White-Box Cryptography (WBC) and investigated its
relationship with obfuscation. We presented definitions and (im)possibility results for obfuscators of
specific classes of 'interesting' program families - those that are not approx. learnable. The security
requirements of WBC are captured by means of a White-Box Property (WBP), which is defined for
some scheme and a security notion (we call the pair a specification). The security requirement of
an obfuscator is captured using a soundness property. We showed that WBP and soundness are
in general quite independent of each other by giving some examples where one is satisfied but the
other is not.

Although the WBP is defined for a particular (family, specification) pair, soundness is only 
defined for a given family and is independent of the specification. A natural question is whether 
there exist non-trivial families for which the WBP w.r.t. every specification can be reduced to 
the soundness of an obfuscator for that family. Loosely speaking, an obfuscator that achieves this 
is said to satisfy the Universal White-Box Property (UWBP) for that family. We showed that the
UWBP fails for every family that is not approx. learnable. However, we showed that under reasonable
assumptions there exists an obfuscator O satisfying UWBP for a non-learnable but approx. learnable
family. Furthermore, the specification we used for our negative result is quite contrived. Hence, it
seems reasonable to expect that a meaningful notion of security for WBC based on WBP can still
be achieved for "normal" specifications. As a possible example of this, we presented a (non-trivial)
non-approx. learnable family for which there does exist an obfuscator satisfying the WBP in a
real-world specification. Additionally, we showed that there exists a (contrived) family $Q \in \mathrm{ALF} \setminus \mathrm{LF}$ for
which there exists an obfuscator satisfying UWBP for $Q$.
APPENDIX



A An Example Specification 



Let $\mathcal{E} = (G, E, D)$ be a symmetric encryption scheme. We define the IND-CCA2 specification using
the following simulation. The corresponding specification is called ind-cca2-$\mathcal{E}$. In the following, the
key generation algorithm $G$ takes as input the security parameter ($1^k$) and a $k$-bit random string
$\gamma$. It outputs a $k$-bit encryption/decryption key $key$.



$\mathrm{Expt}_A^{ind\text{-}cca2\text{-}\mathcal{E}}(1^k, r)$:

Family E(Key key, Input (a, m)) {
    /* a is randomness */
    output E(key, a, m)
}

Family D(Key key, Input c) {
    output D(key, c)
}

Family C(Key (b, key, β), Input (m_0, m_1)) {
    /* C is the challenge oracle, b ∈ {0, 1} is a bit. β is randomness */
    output E(key, β, m_b)
}

Function f(Input r) {
    /* $f : \{0,1\}^{2k+1} \to \{0,1\}^k \times \{0,1\}^k \times \{0,1\}^{2k+1}$ */
    parse r as (γ, β, b)
    key ← G(1^{|γ|}, γ)
    output key, key, (b, key, β)
}

key, key, (b, key, β) ← f(r)

$s \leftarrow A^{E^{key},\, D^{key},\, C^{(b,key,\beta)}}(1^k, ind\text{-}cca2\text{-}\mathcal{E})$

If (win(r, QuerySet, s)) output 1 else output 0

Here win := "If (At most one query to $C^{(b,key,\beta)}$) $\wedge$ (No query to $D^{key}$ on output of $C^{(b,key,\beta)}$ after
query to $C^{(b,key,\beta)}$) $\wedge$ ($s = b$)".

Clearly, $E \in_{obf}$ ind-cca2-$\mathcal{E}$.
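The experiment above can be mirrored by a small Python harness. This is a sketch under stated assumptions rather than the formal specification: lengths are counted in bytes instead of bits, the adversary is an ordinary Python callable that receives the three oracles, the query set is tracked only as far as the win predicate needs, and all names (`ind_cca2_experiment`, `toy_G`, `toy_E`, `toy_D`, `comparing_adversary`) are hypothetical. The toy scheme is deliberately insecure and exists only to exercise the harness.

```python
import secrets

def ind_cca2_experiment(G, E, D, adversary, k):
    """Run one IND-CCA2 game; return 1 if the win predicate holds, else 0."""
    gamma = secrets.token_bytes(k)   # randomness for key generation
    beta = secrets.token_bytes(k)    # randomness used by the challenge oracle
    b = secrets.randbelow(2)         # hidden challenge bit
    key = G(k, gamma)
    challenges = []                  # outputs of the challenge oracle so far
    state = {"bad_decryption": False}

    def enc_oracle(a, m):
        return E(key, a, m)

    def dec_oracle(c):
        if c in challenges:          # D queried on an output of C after it was issued
            state["bad_decryption"] = True
        return D(key, c)

    def challenge_oracle(m0, m1):
        c = E(key, beta, m1 if b else m0)
        challenges.append(c)
        return c

    s = adversary(enc_oracle, dec_oracle, challenge_oracle, k)
    win = len(challenges) <= 1 and not state["bad_decryption"] and s == b
    return 1 if win else 0

# Deliberately insecure toy scheme: deterministic XOR with the key.
def toy_G(k, gamma):
    return gamma

def toy_E(key, a, m):
    return bytes(x ^ y for x, y in zip(key, m))

def toy_D(key, c):
    return bytes(x ^ y for x, y in zip(key, c))

def comparing_adversary(enc, dec, chal, k):
    """Wins against any deterministic scheme by re-encrypting m0."""
    m0, m1 = bytes(k), b"\xff" * k
    c = chal(m0, m1)
    return 0 if enc(b"", m0) == c else 1
```

Against the toy scheme, `comparing_adversary` wins every run because encryption ignores its randomness. An adversary that instead passes the challenge ciphertext to the decryption oracle always loses, since the win predicate rejects that query pattern.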



